Smart Paper Notes

Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection

Simin Li , Huangxinxin Xu , Jiakai Wang , Aishan Liu , Fazhi He , Xianglong Liu , Dacheng Tao

Categories: Computer Vision | Artificial Intelligence

2022-08-23

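Billions of people share images of their daily lives on social media every day. However, their biometric information (e.g., fingerprints) can easily be stolen from these images. Because fingerprints act as a lifelong individual biometric password, the threat of fingerprint leakage from social media creates a strong demand for anonymizing shared images while maintaining image quality. To guard against fingerprint leakage, adding imperceptible perturbations to images has emerged as a solution. However, existing works are either weak in black-box transferability or appear unnatural. Motivated by the visual perception hierarchy (i.e., high-level perception exploits model-shared semantics that transfer well across models, while low-level perception extracts primitive stimuli and is highly sensitive to suspicious stimuli), we propose a hierarchical perceptual noise injection framework to address these problems. For black-box transferability, we inject protective noise into the fingerprint orientation field to perturb the model-shared high-level semantics (i.e., fingerprint ridges). For visual naturalness, we suppress low-level local contrast stimuli by regularizing the response of the Lateral Geniculate Nucleus. Our FingerSafe is the first to provide feasible fingerprint protection in both digital (up to 94.12%) and realistic scenarios (Twitter and Facebook, up to 68.75%). Our code can be found at https://github.com/nlsde-safety-team/fingersafe.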

translated by Google Translate

Safety Index Synthesis via Sum-of-Squares Programming

Weiye Zhao , Tairan He , Tianhao Wei , Simin Liu , Changliu Liu

Category: Robotics

2022-09-19

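Control systems often need to satisfy strict safety requirements. A safety index provides a convenient way to evaluate the safety level of a system and to derive the resulting safe control policies. However, designing safety index functions under control limits is difficult and requires a great deal of expert knowledge. This paper proposes a framework for synthesizing the safety index for general control systems using sum-of-squares programming. Our approach is to show that ensuring the non-emptiness of safe control on the boundary of the safe set is equivalent to a local manifold positiveness problem. We then prove that this problem is equivalent to sum-of-squares programming via the Positivstellensatz of algebraic geometry. We validate the proposed method on robot arms with different degrees of freedom and on ground vehicles. The results show that the synthesized safety index guarantees safety and that our method is effective even for high-dimensional robot systems.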
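
To make the term "sum-of-squares" concrete, the following is a minimal sketch of what an SOS certificate looks like. It is not the paper's synthesis procedure, which would require a semidefinite-programming solver; the polynomial `p`, the monomial basis, and the hand-picked `FACTOR` matrix are all illustrative assumptions. The code only checks that a given decomposition reproduces the polynomial, which certifies that it is nonnegative everywhere.

```python
# Toy illustration only (assumed example, not the paper's method):
# verify a hand-picked sum-of-squares certificate for p(x) = x^4 + 2x^2 + 1.

def p(x):
    """The polynomial whose nonnegativity we want to certify."""
    return x**4 + 2 * x**2 + 1


def monomials(x):
    """Monomial basis z(x) = [1, x, x^2]."""
    return [1.0, x, x**2]


# Hypothetical certificate: each row q_i gives one square (q_i . z(x))^2.
FACTOR = [
    [1.0, 0.0, 0.0],       # contributes 1
    [0.0, 2**0.5, 0.0],    # contributes 2 x^2
    [0.0, 0.0, 1.0],       # contributes x^4
]


def sos_value(x):
    """Evaluate sum_i (q_i . z(x))^2 for the certificate above."""
    z = monomials(x)
    return sum(sum(c * m for c, m in zip(row, z)) ** 2 for row in FACTOR)


def certificate_holds(samples, tol=1e-9):
    """Check p(x) == sos_value(x) at every sample, up to a relative tolerance."""
    return all(
        abs(p(x) - sos_value(x)) <= tol * max(1.0, abs(p(x))) for x in samples
    )
```

Both sides are polynomials of degree at most four, so agreement at five or more distinct sample points (up to floating-point tolerance) indicates that they coincide; for example, `certificate_holds([-2.0, -1.0, 0.0, 0.5, 1.0, 3.0])` checks six points.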

PIAT: Physics Informed Adversarial Training for Solving Partial Differential Equations

Simin Shekarpaz , Mohammad Azizmalayeri , Mohammad Hossein Rohban

Category: Machine Learning

2022-07-14

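In this paper, we propose physics informed adversarial training (PIAT) of neural networks for solving nonlinear differential equations (NDEs). It is well known that standard training of neural networks results in non-smooth functions. Adversarial training (AT) is an established defense mechanism against adversarial attacks, and it may also help make the solution smooth. AT augments the training mini-batch with a perturbation that adversarially makes the network output mismatch the desired output. Unlike conventional AT, which relies only on the training data, here we encode the governing physical laws in the form of nonlinear differential equations using automatic differentiation in the adversarial network architecture. We compare PIAT with PINN to indicate the effectiveness of our method in solving NDEs of up to 10 dimensions. Moreover, we propose weight decay and Gaussian smoothing to demonstrate the advantages of PIAT. The code repository is available at https://github.com/rohban-lab/piat.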
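
The idea of adversarial training on a physics residual can be sketched on a toy problem. The example below is an assumption-laden illustration, not the authors' implementation: it uses a one-parameter trial solution u(x) = exp(w x) for the linear ODE u' + u = 0 instead of a neural network, finite differences instead of automatic differentiation, and an FGSM-style sign step on the collocation points as the adversarial perturbation. All function names and hyperparameters are hypothetical.

```python
import math


def residual(w, x):
    """Residual of u'(x) + u(x) = 0 for the trial solution u(x) = exp(w * x)."""
    return (w + 1.0) * math.exp(w * x)


def physics_loss(w, xs):
    """Mean squared residual over the collocation points xs."""
    return sum(residual(w, x) ** 2 for x in xs) / len(xs)


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def adversarial_points(w, xs, eps=0.05, h=1e-5):
    """Shift each point by eps in the direction that increases its squared
    residual (FGSM-style step), then clamp it back to the domain [0, 1]."""
    shifted = []
    for x in xs:
        grad_x = (residual(w, x + h) ** 2 - residual(w, x - h) ** 2) / (2 * h)
        shifted.append(min(1.0, max(0.0, x + eps * sign(grad_x))))
    return shifted


def train(w=0.5, steps=500, lr=0.05, h=1e-5):
    """Minimize the physics loss on adversarially shifted collocation points."""
    xs = [i / 4 for i in range(5)]  # clean collocation points on [0, 1]
    for _ in range(steps):
        xs_adv = adversarial_points(w, xs)
        # Central finite difference stands in for automatic differentiation.
        grad_w = (physics_loss(w + h, xs_adv) - physics_loss(w - h, xs_adv)) / (2 * h)
        w -= lr * grad_w
    return w
```

Calling `train()` drives `w` toward -1, the exact solution u(x) = exp(-x) of this toy equation; the initial condition u(0) = 1 is satisfied by construction.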

Self-Supervised Deep Subspace Clustering with Entropy-norm

Guangyi Zhao , Simin Kou , Xuesong Yin

Category: Computer Vision

2022-06-10

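Autoencoder-based deep subspace clustering (DSC) is widely used in computer vision, motion segmentation, and image processing. However, it suffers from three issues in the self-expressive matrix learning process: first, the simple reconstruction loss provides little useful information for learning self-expressive weights; second, constructing the self-expressive layer, whose size depends on the sample size, incurs a high computational cost; and third, existing regularization terms offer limited connectivity. To address these issues, we propose a novel model named Self-Supervised deep Subspace Clustering with Entropy-norm (S$^{3}$CE). Specifically, S$^{3}$CE exploits a self-supervised contrastive network to obtain a more effective feature vector. The local structure and dense connectivity of the original data benefit from the self-expressive layer and an additional entropy-norm constraint. Moreover, a new module with data augmentation is designed to help S$^{3}$CE focus on the key information in the data and to improve the clustering performance of positive and negative instances through spectral clustering. Extensive experimental results demonstrate the superior performance of S$^{3}$CE compared with state-of-the-art approaches.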

Learning to Reverse DNNs from AI Programs Automatically

Simin Chen , Hamed Khanpour , Cong Liu , Wei Yang

Categories: Machine Learning | Artificial Intelligence

2022-05-20

As DNNs are increasingly deployed privately on edge devices, the security of on-device DNNs has raised significant concern. To quantify the model leakage risk of on-device DNNs automatically, we propose NNReverse, the first learning-based method that can reverse DNNs from AI programs without domain knowledge. NNReverse trains a representation model to represent the semantics of binary code for DNN layers. By searching for the most similar function in our database, NNReverse infers the layer type of a given function's binary code. To represent the semantics of assembly instructions precisely, NNReverse proposes a more fine-grained embedding model that captures the textual and structural semantics of assembly functions.
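
The "search for the most similar function" step can be pictured as a nearest-neighbor lookup over embeddings. The sketch below is only an illustration under stated assumptions: the three-dimensional vectors, the layer names in `DATABASE`, and the use of cosine similarity are all hypothetical, and the abstract does not say which similarity measure or embedding size NNReverse actually uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical database: layer type -> embedding of a known binary function.
DATABASE = {
    "conv2d": [0.9, 0.1, 0.0],
    "dense": [0.1, 0.9, 0.1],
    "relu": [0.0, 0.2, 0.9],
}


def infer_layer_type(query, database):
    """Return the layer type whose stored embedding is most similar to query."""
    return max(database, key=lambda name: cosine(query, database[name]))
```

For instance, `infer_layer_type([0.8, 0.2, 0.1], DATABASE)` picks the entry whose embedding points in the most similar direction to the query.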

Informal Persian Universal Dependency Treebank

Roya Kabiri , Simin Karimi , Mihai Surdeanu

Category: Natural Language Processing

2022-01-10

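This paper presents the phonological, morphological, and syntactic distinctions between formal and informal Persian, showing that the two variants have fundamental differences that cannot be attributed solely to pronunciation discrepancies. Given that informal Persian exhibits particular characteristics, any computational model trained on formal Persian is unlikely to transfer well to informal Persian, which necessitates dedicated treebanks for this variety. We therefore detail the development of the open-source Informal Persian Universal Dependency Treebank, a new treebank annotated within the Universal Dependencies scheme. We then investigate the parsing of informal Persian by training two dependency parsers on existing formal treebanks and evaluating them on out-of-domain data, i.e., the development set of our informal treebank. Our results show that the parsers suffer a substantial performance drop when moving across the two domains, as they face more unknown tokens and structures and fail to generalize well. Furthermore, the dependency relations whose performance deteriorates the most represent the unique properties of the informal variant. The ultimate goal of this study, which has a broader impact, is to provide a stepping stone toward revealing the importance of informal language variants, which have been widely overlooked by natural language processing tools across languages.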

Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Junjie Yan , Yingfei Liu , Jianjian Sun , Fan Jia , Shuailin Li , Tiancai Wang , Xiangyu Zhang

Category: Computer Vision

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.

Backdoor Attacks Against Dataset Distillation

Yugeng Liu , Zheng Li , Michael Backes , Yun Shen , Yang Zhang

Category: Machine Learning

2023-01-03

Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
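
The simpler of the two attacks, which adds triggers to the raw data before distillation, can be pictured with a small sketch. Everything concrete here is an assumption made for illustration: the square patch in the bottom-right corner, the patch size, the poisoning ratio, and the function names are hypothetical and are not taken from the paper.

```python
import random


def stamp_trigger(image, patch_size=2, value=1.0):
    """Return a copy of a 2-D grayscale image (list of rows) with a square
    patch of the given value written into its bottom-right corner."""
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = value
    return stamped


def poison_dataset(samples, target_label, ratio=0.1, seed=0):
    """Stamp the trigger on a random fraction of (image, label) pairs and
    relabel those pairs to target_label; other pairs are returned unchanged."""
    rng = random.Random(seed)
    n_poison = int(len(samples) * ratio)
    chosen = set(rng.sample(range(len(samples)), n_poison))
    return [
        (stamp_trigger(img), target_label) if i in chosen else (img, label)
        for i, (img, label) in enumerate(samples)
    ]
```

In this picture the poisoned set would then be handed to an ordinary distillation routine; the second attack described above differs in that it keeps updating the trigger during distillation, which this sketch does not attempt.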

Language Models are Drummers: Drum Composition with Natural Language Pre-Training

Li Zhang , Chris Callison-Burch

Category: Natural Language Processing

2023-01-03

Automatic music generation with artificial intelligence typically requires a large amount of data, which is hard to obtain for many less common genres and musical instruments. To tackle this issue, we present ongoing work and preliminary findings on the possibility for deep models to transfer knowledge from language to music, by finetuning large language models pre-trained on a massive text corpus on only hundreds of MIDI files of drum performances. We show that by doing so, one of the largest state-of-the-art models (GPT3) is capable of generating reasonable drum grooves, while models that are not pre-trained (Transformer) show no such ability beyond naive repetition. Evaluating generated music is a challenging task, and evaluating drum grooves, which has little precedent in the literature, is even more so. Hence, we propose a tailored structural evaluation method and analyze drum grooves produced by GPT3 compared to those played by human professionals, exposing the strengths and weaknesses of such generation by language-to-music transfer. Our findings suggest that language-to-music transfer learning with large language models is viable and promising.

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han , Jiangning Zhang , Zhucun Xue , Chao Xu , Xintian Shen , Yabiao Wang , Chengjie Wang , Yong Liu , Xiangtai Li

Category: Computer Vision

2023-01-03

Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset in the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
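
One common way to turn a support mask into a class center is masked average pooling, followed by channel-wise re-weighting of the query features. The sketch below assumes exactly that; it is not confirmed by the abstract to be the module RefT uses, and the flat list-of-pixels layout and function names are hypothetical choices made to keep the example dependency-free.

```python
def masked_average(features, mask):
    """Average per-pixel feature vectors over the pixels where mask is 1.

    features: list of pixels, each a list of C channel values.
    mask: list of 0/1 flags, one per pixel.
    """
    total = sum(mask)
    if total == 0:
        raise ValueError("mask selects no pixels")
    channels = len(features[0])
    return [
        sum(pixel[c] * m for pixel, m in zip(features, mask)) / total
        for c in range(channels)
    ]


def reweight(query, center):
    """Scale every query pixel channel-wise by the class center."""
    return [[q * c for q, c in zip(pixel, center)] for pixel in query]
```

With support features `[[1.0, 0.0], [3.0, 2.0], [10.0, 10.0]]` and mask `[1, 1, 0]`, the background pixel is ignored when the center is computed, so it cannot dominate the re-weighting of the query.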
